A New Clustering Algorithm On Nominal Data Sets
Abstract
This paper presents a new clustering technique, the Olary algorithm, for clustering nominal data sets. The algorithm uses a new code, the Olary code, to transform nominal attributes into integer ones through a process called the Olary transformation. The number of integer attributes produced by the Olary transformation usually differs from the number of original nominal attributes. An extension of the Olary algorithm, called the ex-Olary algorithm, is also introduced. Finally, we provide a way to estimate the number of underlying clusters using a new kind of diagram, the Number of Clusters versus Distance Diagram (NCDD).
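The abstract does not define the Olary code, so the following is only a rough sketch of the general idea of turning nominal attributes into integer ones. It uses plain one-hot encoding as a stand-in, which is an assumption and not the Olary transformation; the function names `one_hot_transform` and `hamming` are hypothetical. It illustrates how the number of integer attributes can differ from the number of original nominal attributes.

```python
from typing import List, Sequence, Tuple


def one_hot_transform(
    rows: Sequence[Sequence[str]],
) -> Tuple[List[List[int]], List[List[str]]]:
    """Encode nominal rows as integer vectors with one-hot encoding.

    This is a stand-in for illustration only, NOT the Olary code.
    """
    n_attrs = len(rows[0])
    # Sorted distinct values observed for each nominal attribute.
    categories = [sorted({row[j] for row in rows}) for j in range(n_attrs)]
    encoded = []
    for row in rows:
        vec: List[int] = []
        for j, value in enumerate(row):
            # One integer (0 or 1) per distinct value of attribute j.
            vec.extend(1 if value == c else 0 for c in categories[j])
        encoded.append(vec)
    return encoded, categories


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two integer vectors differ."""
    return sum(x != y for x, y in zip(a, b))


# Two nominal attributes (colour, size) become five integer attributes.
rows = [("red", "small"), ("blue", "large"), ("green", "small")]
encoded, categories = one_hot_transform(rows)
print(len(rows[0]), "nominal attributes ->", len(encoded[0]), "integer attributes")
print(encoded)
```

Once the data are integer vectors, any distance defined on integers (here a simple Hamming count) can drive a clustering step; which distance the Olary algorithm itself uses is not stated in the abstract.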
Similar resources
Solving Data Clustering Problems using Chaos Embedded Cat Swarm Optimization
In this paper, a new method is proposed for solving the data clustering problem using the Cat Swarm Optimization (CSO) algorithm based on chaotic behavior. Data clustering is an important problem in data mining that has long attracted researchers and practitioners because of its numerous applications to real-world problems. The CSO algorithm is one ...
Improvement of density-based clustering algorithm using modifying the density definitions and input parameter
Clustering, the grouping of similar samples, is one of the main tasks in data mining. There is a wide variety of clustering algorithms, and density-based clustering is one category. Various algorithms have been proposed in this category; one of the most widely used is DBSCAN. DBSCAN can identify clusters of different shapes in the dataset and automatically i...
Automatic Clustering of Mixed Data Using a Genetic Algorithm
In real-world clustering problems, cluster analysis often has to be performed on data sets with mixed numeric and categorical values. However, most existing clustering algorithms are efficient only for numeric data rather than mixed data. In addition, traditional methods such as the K-means algorithm usually require the user to provide the number of clusters. In this...
An Incremental DC Algorithm for the Minimum Sum-of-Squares Clustering
An algorithm is presented for solving minimum sum-of-squares clustering problems using their difference-of-convex representations. The proposed algorithm is based on an incremental approach and applies the well-known DC algorithm at each iteration. It is tested and compared with other clustering algorithms on large real-world data sets.